Identification of Topic and Focus in Czech: Evaluation of Manual Parallel Annotations

Authors

  • Šárka Zikánová
  • Miroslav Týnovský
  • Jiří Havelka
Abstract

This paper presents the results of a control annotation of the Topic-Focus Articulation of Czech sentences based on the notion of “aboutness”. This is one of the steps in testing the hypothesis about the relation between contextual boundness and “aboutness”. We suppose that the bipartition of the sentence into its Topic and Focus (“aboutness”) can be automatically derived from the values of contextual boundness assigned to each node of the dependency tree representing the underlying structure of the sentence. To test this hypothesis, control manual parallel annotations have been carried out. The principles of the control annotations are described and preliminary results are reported.

Similar resources

(Pre-)Annotation of Topic-Focus Articulation in Prague Czech-English Dependency Treebank

The objective of the present contribution is to give a survey of the annotation of information structure in the Czech part of the Prague Czech-English Dependency Treebank. We report on this first step in the process of building a parallel annotation of information structure in this corpus, and elaborate on the automatic pre-annotation procedure for the Czech part. The results of the pre-annotat...

In Search of the Best Method for Sentence Alignment in Parallel Texts

After a brief account of a parallel corpus project involving many diverse languages and a summary of two previous evaluations of sentential alignment tools, results are presented from tests of three automatic aligners on English-Czech and French-Czech literary and legal texts, clean and noisy. The results confirm that an alignment tool may perform well on one type of texts and fail on another typ...

Transitions thématiques : Annotation d'un corpus journalistique et premières analyses (Manual thematic annotation of a journalistic corpus : first observations and evaluation) [in French]

Manual thematic annotation of a journalistic corpus : first observations and evaluation. The work presented in this paper focuses on the creation of a corpus of journalistic texts annotated at discourse level, more precisely on a topic level. The annotation model is a classic segmentation one, to which we add transition zones between topical units. We assume that in a well-structured text, the a...

Utilization of a 17 Microsatellites Set For Bovine Traceability in Czech Cattle Populations

For identification of individuals and parentage control performed by cattle breeders in the Czech Republic, a novel Finnish Bovine Genotypes™ Panel 3.1 was amplified by means of one multiplex polymerase chain reaction. Bovine Panel encompasses all the 12 STR loci recommended by the International Society for Animal Genetics (ISAG) for routine use in parentage testing and identification, including...

Unsupervised Negation Focus Identification with Word-Topic Graph Model

Due to the commonality in natural language, negation focus plays a critical role in deep understanding of context. However, existing studies for negation focus identification rely mainly on supervised learning, which is time-consuming and expensive due to manual preparation of annotated corpus. To address this problem, we propose an unsupervised word-topic graph model to represent and measure the focus...

Journal title:
  • Prague Bull. Math. Linguistics

Volume 87  Issue 

Pages  -

Publication date 2007